Abstract - In this paper we show that the fluorescence background
and the shot noise in Raman spectra can be drastically reduced by a
mathematical method that combines morphological and moving average
filters. The method was tested on biological materials. Simulations
show that, after the morphological and moving average filters are
applied to the raw Raman data, the line shape is essentially free of
fluorescence background and shot noise, and the peaks keep their
original characteristics: position, intensity, area, and relative
intensity among peaks. In addition, low-intensity peaks masked by
these noises become resolved after the operations. The noise removal
method reported here may find immediate application in improving
spectroscopic processes for food quality assurance and for the
screening of metabolites in human samples.

Index Terms - near infrared Raman spectroscopy, noise removal,
morphological filter, moving average filter, biological samples


I. INTRODUCTION 

Raman spectroscopy has proven to be an extremely capable tool to
characterize semiconductors [1] and identify crystal pigments [2-7],
among other applications [8-10]. The small signal obtained in a
standard Raman measurement, however, not only diminishes the
efficiency of the technique but can also become a major drawback in
the data analysis. Background noises, such as shot noise and noise
due to the instruments, may limit the detection of the Raman signal,
as fluorescence does. Fluorescence, the most common source of
background in Raman spectroscopy, is particularly significant
because it can be intrinsic to the studied sample. As the
fluorescence is usually some orders of magnitude stronger than the
Raman signal, it can easily compromise the data analysis by masking
some of the characteristic bands of the material. The presence of
fluorescence in a Raman spectrum therefore generally reduces the
effective observation of the useful signal. Both shot noise and
fluorescence must then be reduced to allow a clear identification
and later characterization of the bands of the studied samples. This
is of paramount importance when Raman spectroscopy is applied to
biomolecular samples, whose intrinsic shot and fluorescence noises
may mask the useful signal. Mathematical and instrumental methods
have been proposed to eliminate the fluorescence level but, to our
knowledge, not to eliminate the shot noise, and even less to do both
in the same proposal.

All morphological operations are basically the result of one or
more set operations, such as union or intersection, between two sets
of data X and Y that belong to a space Z. In an operation between X
and Y, one of the sets (Y, for instance) is designated as the
structuring element, which operates on the set X. To do so, the
structuring element is shifted through the space Z, and the
operation transforms the data of X into another set. The result of
the transformation helps to visualize particular geometric
structures in the original data set X. The shape of Y is chosen a
priori according to the morphology of the set to be transformed and
to the structures that are to be extracted. Mathematical morphology
has thus been used lately as a nonlinear processing technique,
mainly in digital image processing. The only requirement for its
application is that the data can be treated as sets, with their
properties [11-13]. In any case, mathematical morphology is based on
a classical theory [14].
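
As a minimal sketch of the set-based view described above, the
following Python fragment treats X and the structuring element Y as
sets of integer positions in a one-dimensional space Z. The function
names `dilate_set` and `erode_set` and the sample values are
illustrative assumptions, not taken from the authors' program.
Dilation is written as a union of shifted copies of X and erosion as
an intersection of oppositely shifted copies.

```python
def dilate_set(X, Y):
    """Dilation: union of copies of X shifted by every element of Y."""
    result = set()
    for y in Y:
        result |= {x + y for x in X}
    return result


def erode_set(X, Y):
    """Erosion: intersection of copies of X shifted by -y for every y in Y.

    A position z survives only if z + y lies in X for every y in Y,
    that is, if the structuring element placed at z fits inside X.
    """
    shifted = [{x - y for x in X} for y in Y]
    return set.intersection(*shifted)


X = {2, 3, 4, 5, 9}      # data set: a run of four points and an isolated point
Y = {-1, 0, 1}           # structuring element: three-point window centered at 0

eroded = erode_set(X, Y)             # {3, 4}
opened = dilate_set(eroded, Y)       # {2, 3, 4, 5}
```

In this toy case the erosion followed by the dilation restores the
run of four points but not the isolated point at 9, which is narrower
than the structuring element.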

Erosion and dilation are the most basic morphological operations
and are the basis of any morphological transformation. In one
dimension, the erosion of a function f(x) by a structuring element Y
is defined as the minimum value of the function in the window
determined by the structuring element centered at x. The dilation of
f(x) by Y, on the other hand, is defined as the maximum value of the
function in the window determined by the structuring element, also
centered at x. Erosion and dilation are transformations with no
inverse, so the original signal cannot be recovered from them. The
original signal can, however, be approximated by an erosion followed
by a dilation with the same structuring element. This morphological
operation is called opening. A dilation followed by an erosion is a
morphological closing. Opening and closing are the basic
morphological operations used to remove noise. Opening removes small
positive features, while closing fills small holes, in both cases
those narrower than the structuring element. The opening of a
function is obtained by the erosion of the function by Y followed by
the dilation of the eroded function, and is described mathematically
as

$$O_Y f(x) = \delta_Y\left(\varepsilon_Y f(x)\right) \qquad (1)$$

where $\varepsilon_Y f(x)$ and $\delta_Y f(x)$ are the erosion and
dilation functions, respectively. The result is a nonlinear
smoothing of the signal that removes the positive peaks narrower
than the structuring element. That is, the opening never takes
values higher than the original set and can approximate the baseline
of a spectrum. A morphological operation called the top-hat
transformation recovers the structures eliminated by the opening
operation. Mathematically, it can be described as

$$th f(x) = f(x) - O_Y f(x) \qquad (2)$$

where $th f(x)$ is the top-hat transformation, $f(x)$ is the
function related to the X data set, and $O_Y f(x)$ is the opening
function. The top-hat operation can be used to obtain a
baseline-free spectrum.
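
The operations above can be sketched in a few lines of Python with
NumPy. This is only an illustration under stated assumptions, not the
authors' MatLab program: the structuring element is taken to be a
flat window of odd width, the borders are handled by repeating the
end values, and the function names and the synthetic signal are
hypothetical.

```python
import numpy as np


def erosion(f, width):
    """Minimum of f inside a flat window of odd `width` centered at each x."""
    half = width // 2
    padded = np.pad(f, half, mode="edge")  # repeat end values at the borders
    return np.array([padded[i:i + width].min() for i in range(len(f))])


def dilation(f, width):
    """Maximum of f inside a flat window of odd `width` centered at each x."""
    half = width // 2
    padded = np.pad(f, half, mode="edge")
    return np.array([padded[i:i + width].max() for i in range(len(f))])


def opening(f, width):
    """Equation (1): erosion followed by dilation with the same window."""
    return dilation(erosion(f, width), width)


def top_hat(f, width):
    """Equation (2): the signal minus its opening."""
    return f - opening(f, width)


# Synthetic example: a sloping background plus one narrow peak.
x = np.arange(50, dtype=float)
background = 10.0 + 0.2 * x
signal = background.copy()
signal[24:27] += np.array([3.0, 6.0, 3.0])   # peak three samples wide

baseline = opening(signal, width=7)          # window wider than the peak
corrected = top_hat(signal, width=7)
```

In this synthetic case the opening stays at or below the signal
everywhere, and the top-hat output keeps the peak at its original
position while the sloping background is removed away from the peak
and the borders. The window width is a free parameter that has to be
chosen wider than the peaks of interest.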

The moving average is the underlying concept behind trading
indicators, and it has remained unaltered for more than half a
century [15-18]. Development in this field has consisted basically
of proposing new ad hoc rules and of using more elaborate types of
moving averages in the existing rules, because the technique still
gives reliable results to the present day. Working with past and
current market prices, trading volume, and other public information,
the moving average technique is widely used to predict market
prices. The performance of any moving average depends exclusively on
the shape of the weighting function N. An N-interval moving average
(MA) at interval end f is computed as

$$MA_f(N) = \frac{n_0 + n_1 + \cdots + n_f}{N} \qquad (3)$$

where $n_0, n_1, \ldots, n_f$ are the most recent observations of
the closed interval, N is the weighting function, and f is the
interval end.
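
A minimal Python sketch of an N-interval moving average follows. It
assumes equal weights 1/N over the N most recent observations; the
function name `moving_average` and the sample values are hypothetical
and serve only to illustrate equation (3).

```python
def moving_average(observations, N):
    """Mean of the N most recent observations at each interval end f.

    Returns one value for every f at which N observations are available.
    """
    if N < 1 or N > len(observations):
        raise ValueError("N must be between 1 and the number of observations")
    averages = []
    for f in range(N - 1, len(observations)):
        window = observations[f - N + 1 : f + 1]   # closed interval ending at f
        averages.append(sum(window) / N)
    return averages


prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]   # made-up values
ma3 = moving_average(prices, N=3)               # [11.0, 12.0, 13.0, 14.0]
```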

In this paper we report the results of applying a two-step
mathematical method, implemented in a homemade program that combines
morphological and moving average operations, to remove the
fluorescence and shot noises from the Raman spectra of biological
samples. Our results show that the method leaves unchanged the
characteristics of the line shape of the Raman signal (position,
intensity, area, and relative intensity among peaks) while
drastically reducing the fluorescence and the characteristic ripple
of the shot noise. The results show the utility of using
morphological and moving average filters together to remove the
intrinsic noises of biological samples. The noise removal method
reported here may find immediate application in improving
spectroscopic processes for food quality assurance and for the
screening of metabolites in human samples.

II. EXPERIMENTAL 

The samples used in this work were a sample of fresh carrot, a
commercial tomato sauce, a gray human hair, and a sample of human
serum. The first three samples were measured without any
preparation. The human serum sample was obtained from a patient
clinically diagnosed with corneal ulcer and treated at the Institute
of Ophthalmology Conde de Valenciana, Mexico City, Mexico. The blood
sample was centrifuged at 3500 rpm for 10 minutes and then frozen in
an ultra-freezer at -50 °C for preservation. For the Raman
measurements, the sample was passively thawed at room temperature.
The Raman spectra of the samples were measured using a micro-Raman
system (Renishaw 1000B) in a backscattering geometry [19], with a
600 lines/mm grating and a CCD camera (RenCam, 1024x256 pixels). The
excitation wavelength of the system is 830 nm, and the laser beam,
with a spot size of about 2 mm, was focused onto the sample with the
50x objective of a Leica (DMLM) microscope. To reduce the intrinsic
fluorescence and shot noises in the raw spectra, all data were
treated with a homemade program that uses the morphological and
moving average operations described in the second and third
paragraphs of the results and discussion section of this paper. In
all cases, the homemade program was run on the MatLab platform.
Prior to the measurements, the Raman system was calibrated with the
520 cm⁻¹ Raman peak of a silicon semiconductor. All signals were
taken under the same experimental conditions with an exposure of 60
seconds.


III. RESULTS AND DISCUSSION 
As previously stated, the small signal obtained in a standard Raman
measurement usually sits on a fluorescence signal some orders of
magnitude stronger than the Raman signal. As a result, the useful
signal may easily be masked by the fluorescence, which in turn makes
the characterization a very complicated task. The situation is worse
if the Raman signal also carries a high content of other noises,
such as shot or atmospheric noise. This is why any new method to
clean the signal is of paramount importance. With a useful signal
free of noise, the data analysis is greatly simplified. In any case,
a clear identification of the characteristic bands of a material
increases the effectiveness of the technique itself, for example in
improving spectroscopic processes for food quality assurance and, as
a first approach, in the screening of metabolites in human serum
samples.

The mathematical filters used in this work to suppress the
fluorescence and shot noises in the Raman spectra of biological
samples are based on an averaged opening operation,

$$O_a f(x) = \frac{1}{2}\left\{\delta_Y\left[O_Y f(x)\right] + \varepsilon_Y\left[O_Y f(x)\right]\right\} \qquad (4)$$

where $\delta_Y$ and $\varepsilon_Y$ denote dilation and erosion by
the structuring element Y and $O_Y f(x)$ is the opening. This gives
a closer baseline in the regions where there are Raman bands. After
that, the filter uses the expression

$$O_{bp} f(x) = \min\left[O_a f(x),\, O_Y f(x)\right] \qquad (5)$$

to obtain the best approximation of the baseline. The baseline is
then removed by applying the top-hat transformation, which gives the
correction for the opening operation, as was shown in reference
[20].
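
The baseline estimate can be sketched in Python with NumPy as
follows. This is a hedged illustration, not the authors' MatLab
program: it assumes that equation (4) averages the dilation and the
erosion of the opening, that the structuring element is a flat window
of odd width, and that borders are handled by repeating the end
values. All function names and the synthetic data are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(f, width):
    """All flat windows of odd `width`, centered on each sample (edge-padded)."""
    return sliding_window_view(np.pad(f, width // 2, mode="edge"), width)


def erosion(f, width):
    return _windows(f, width).min(axis=1)


def dilation(f, width):
    return _windows(f, width).max(axis=1)


def opening(f, width):
    return dilation(erosion(f, width), width)


def baseline_estimate(f, width):
    """Assumed reading of equations (4) and (5)."""
    o_y = opening(f, width)
    o_a = 0.5 * (dilation(o_y, width) + erosion(o_y, width))   # equation (4)
    return np.minimum(o_a, o_y)                                 # equation (5)


def remove_baseline(f, width):
    """Top-hat style correction: subtract the estimated baseline."""
    return f - baseline_estimate(f, width)


# Synthetic example: a decreasing linear background plus one narrow peak.
x = np.arange(100, dtype=float)
signal = 50.0 - 0.1 * x
signal[59:62] += np.array([2.0, 5.0, 2.0])

plain_top_hat = signal - opening(signal, 9)
corrected = remove_baseline(signal, 9)
```

In this synthetic case the plain opening leaves a flat plateau under
the peak, so the plain top-hat underestimates the peak height (about
4.8 instead of 5.0), whereas the combined estimate follows the sloping
background under the peak and recovers the added heights.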

For the removal of the shot noise we employed a moving average
operation of the form

$$MA = \frac{\sum \left(N \text{ most recent data values}\right)}{N} \qquad (6)$$

In equation (6), N is the basis of the moving average. Although
there is no specific rule for selecting N, a large N is recommended
when the behavior of the data is relatively stable over time.
Conversely, if the variable shows changing patterns, a small N is
advisable. In practice, values of N between 2 and 10 are normal. In
this work N was equal to 5. These operations were implemented in the
aforementioned homemade program for simulation.
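
As an illustration of this smoothing step, the sketch below applies
an equal-weight five-point average to a synthetic noisy peak in
Python with NumPy. It is not the authors' MatLab program. The
centered window, the function name `smooth_spectrum`, the axis
values, and the noise level are all assumptions made for the
example.

```python
import numpy as np


def smooth_spectrum(intensity, N=5):
    """Centered N-point moving average of a spectrum (equal weights 1/N).

    Only positions with a full window are returned, so the output is
    N - 1 samples shorter than the input.
    """
    kernel = np.ones(N) / N
    return np.convolve(intensity, kernel, mode="valid")


rng = np.random.default_rng(0)                 # fixed seed, synthetic data only
shift = np.arange(300, 1800, 2.0)              # hypothetical Raman shift axis
peak = 50.0 * np.exp(-0.5 * ((shift - 1000.0) / 15.0) ** 2)
noisy = peak + rng.normal(0.0, 3.0, shift.size)

smoothed = smooth_spectrum(noisy, N=5)
shift_valid = shift[2:-2]                      # axis matching the valid window
```

As a general property of equal-weight averaging, uncorrelated noise
is reduced by roughly the square root of N, at the cost of broadening
features that are narrower than the window, which is one reason to
keep N small for spectra with sharp bands.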

To show the effect of these filters on the raw data of our
experiment, we start our discussion with the Raman spectrum of a
carrot sample. Figure 1(a) shows the raw Raman spectrum, obtained as
described in the experimental section. Figure 1(b) shows the effect
of the morphological filter on the raw data. Figure 1(c) shows the
effect of the moving average filter applied to the output of the
morphological filter for the same carrot spectrum.


[Figure 1(a) panel title: Carrot sample]

Fig. 1(a) Raw Raman spectrum of a carrot sample, obtained as
described in the experimental section.

As can be seen in Figure 1(a), the Raman signal is drastically
affected by a low-frequency fluorescence background, whose
intensity is very high compared with the Raman signal, and by a
very high-frequency shot noise. The fluorescence level almost masks
the signal of interest and hinders the proper characterization of
the sample bands. As a result, the maxima of the three wide bands
suggested in the line shape of Figure 1(a) cannot be located
accurately. The noise level in the region between 1100 and 1550
cm⁻¹, on the other hand, could prevent a clear identification of
the two highest peaks located in that region.

Fig. 1(b) Raman spectrum of the carrot sample after applying the
morphological filter to the raw data, obtained as indicated in the
experimental section.

As can be seen in Figure 1(b), the fluorescence has been reduced
by several orders of magnitude. Because of this, the shot noise is
much more noticeable, and its intensity along the entire region
completely masks even the three wide bands suggested in the raw
Raman spectrum. In particular, the noise intensity between 1100 and
1550 cm⁻¹ makes the identification of the two highest peaks in that
region harder; however, these peaks keep the same position. The
results in Figure 1(b) show that the morphological filter alone is
not enough to suppress the unwanted signals in the raw data. This
is particularly true for biological samples such as the one
discussed in this section, which have a complicated family of
intrinsic noises.

Fig. 1(c) Raman spectrum of the carrot sample after applying the
moving average filter to the output of the morphological filter.

As can be seen in Figure 1(c), applying the moving average filter
to the data obtained after the morphological filter drastically
reduces the high content of shot noise. Treating the output of the
morphological filter with a moving average operation such as the
one used in this work leaves the Raman signal of the biological
sample almost free of noise. After the moving average filter, it is
clear that the highest peaks, indicated with arrows in Figure 1(b),
are due to the material, and that in the region between 305 and
1000 cm⁻¹ there are not three wide bands, as suggested in Figure
1(a), but five. Figure 1(c) also shows a triplet of low-intensity
peaks around the peak located near 1518 cm⁻¹, which remained
completely masked even after the morphological filter. In short,
the moving average filter has uncovered the Raman signal of the
carrot sample, which mainly includes the five wide bands between
305 and 1000 cm⁻¹ and the highest peaks between 1100 and 1550 cm⁻¹.
After applying the morphological and moving average filters to the
raw Raman data, it is easy to identify in Figure 1(c) the bands due
to carotene at 1153 and 1518 cm⁻¹, in agreement with reference
[21], and possibly also lycopene at 1002 cm⁻¹, in agreement with
reference [23], and pectin at 374 and 852 cm⁻¹, in accordance with
reference [24]. These results show that the step-by-step
combination of the morphological and moving average operations
applied to the raw Raman data of a carrot sample is a reliable
method for suppressing the fluorescence and shot noises in the
Raman signal of biological samples, given that these operations
preserve the position, area, intensity, and relative intensity of
the Raman peaks of the sample. Our results suggest not only that
the major drawback of the Raman signal, the fluorescence, can be
eliminated from the raw spectra of biological samples, but also
that the shot noise and the noise due to the instruments, which may
also limit the use of the technique, can be suppressed in an
automatic two-step procedure. This in turn becomes a major
advantage in the study of biological materials. Our results show
that the noise removal method reported in this paper may have an
immediate application in improving spectroscopic processes for food
quality assurance.

To complete our analysis and to test the efficiency of the
suppression filters used in this work, we measured the Raman
spectra of a commercial ketchup sauce, a gray human hair, and a
human serum sample. Figure 2 shows the raw Raman spectrum of the
commercial ketchup sauce (filled black dots), the data after the
morphological filter (open circles), and the data after the
morphological and moving average filters (solid line). As can be
seen in this figure, there is again a low-frequency fluorescence
signal that is very strong compared with the Raman signal. The
logical consequence of such a high offset is that the structure of
the material is masked in the spectrum. Under these circumstances,
it is easy to see that a treatment is needed, at least to reduce
the fluorescence. The open circles show that, after the
morphological filter (the first step of our method), the strong
fluorescence level has again been reduced by several orders of
magnitude, and the characteristic structure of the spectrum
appears. The output of the morphological filter, however, still
shows shot noise. This is particularly evident in the region
between 750 and 1100 cm⁻¹. For this reason, the moving average
filter was again applied to the output of the morphological filter.
The solid line in Figure 2 shows the result of applying the
morphological and moving average filters to the raw spectrum of the
commercial ketchup sauce. Our results show again that, after both
filters, the strong fluorescence level and the shot noise have been
suppressed, so that the characteristic structure of the spectrum is
clearly seen. The solid line clearly shows the Raman bands of the
tomato puree related to lycopene at 1508 and 1003 cm⁻¹, the band
related to β-carotene at 1155 cm⁻¹ [21], and possibly also the peak
related to glucose at 417 cm⁻¹ [22]. These results confirm that our
noise removal method may have an immediate application in improving
spectroscopic processes for food quality assurance.

Fig. 2 Raw Raman spectrum of a commercial ketchup sauce (black
dots); the spectrum after the morphological operations (open
circles); and the spectrum after the moving average operation
applied to the output of the morphological operations (solid line).



[Figure 3 panel title: Gray human hair]

Fig. 3 Raw Raman spectrum of a gray human hair sample (black dots);
the spectrum after the morphological operations (open circles); and
the spectrum after the moving average operation applied to the
output of the morphological operations (solid line).


Figure 3 shows the raw Raman spectrum of a gray human hair (filled
black dots), the data after the morphological filter (open
circles), and the data after the morphological and moving average
filters (solid line). In this case, as shown by the line shape of
the filled black dots, the fluorescence level allows the structure
of the material to be seen, but an appropriate characterization is
still difficult because of the poor resolution of the line shape.
This is a particular problem in the region centered at 1000 cm⁻¹,
where the bands are not clearly resolved in the raw data. This
becomes evident after the morphological filter is applied to the
raw data to suppress the fluorescence level. As can be seen, the
morphological filter not only drastically reduces the fluorescence
level but also reveals a very sharp peak and a clearly defined
asymmetric band in the region between 900 and 1000 cm⁻¹.

Our results show again that this filter fully defines the
structure of the gray human hair spectrum. They also show that the
location of the peaks is invariant under the application of the
morphological filter to the raw data, which is one of the major
features of our method for removing undesirable noises from the
useful signal. The shot noise, however, still remains after the
morphological filter. This is particularly evident in the region
between 600 and 1800 cm⁻¹, where the shot noise has a significant
level and may become a problem for proper characterization. These
results highlight again the need for the moving average filter,
even in cases where the morphological operations reveal the
structure of the material. The solid line in Figure 3 shows our
results after applying the morphological and moving average filters
to the raw data of the gray human hair sample. This solid line
shows the complexity of the gray human hair spectrum. In it, it is
possible to identify cysteine at 513 and 668 cm⁻¹, amide I at 934
and 1651 cm⁻¹, amide III at 1245 cm⁻¹, and phenylalanine around
1002 cm⁻¹ [25]. These results suggest that the method may also be
useful for the screening of metabolites in samples of human
origin.

As a first approach, the complex structure of the human serum
spectrum can be divided into two regions: region A, from 1500 to
600 cm⁻¹, and region B, from 600 to 375 cm⁻¹. Region A has a large
number of Raman bands, which can be assigned to fats, amino acids,
primary metabolites, glucose, and others [26-27]. This mix of
analytes embedded in a matrix containing other components, which
have their own signals and whose bands overlap each other,
complicates the clear identification of a particular analyte in
region A. Region B is less influenced by the other analytes mixed
in the matrix, and it contains the main plasma carbohydrate [28].
However, even with a virtually null influence of the other
components of the mix, a high content of noise in the Raman signal
is always a problem for a proper characterization.

Figure 4 shows the raw Raman spectrum of a human serum sample
(filled black dots), the data after the morphological filter (open
circles), and the data after the morphological and moving average
filters (solid line). In this case, as shown by the line shape of
the filled black dots, the fluorescence level completely masks the
real structure of the human serum within region A. This structure,
however, is completely exposed after the morphological filter is
applied, revealing its complexity.


[Figure 4 panel title: Human serum; horizontal axis: Raman shift (cm⁻¹)]

Fig. 4 Raw Raman spectrum of a human serum sample (black dots); the
spectrum after the morphological operations (open circles); and the
spectrum after the moving average operation applied to the output
of the morphological operations (solid line).

As can be seen in this case again, the morphological filter not
only reduces the fluorescence level by several orders of magnitude
but also reveals the very sharp peak related to phenylalanine and
some asymmetric bands split into doublets and triplets in the
region between 700 and 1300 cm⁻¹. Our results also suggest that the
shape and position of the peaks are invariant under the application
of the morphological filter to the raw data. This filter, however,
as in the other cases described in the previous paragraphs, leaves
the shot noise across the whole studied region. These results again
highlight the need for the moving average filter. The solid line in
Figure 4 shows our results after applying the morphological and
moving average filters to the raw data of a line shape as
complicated as that of the human serum sample. In this solid line
it is easy to see that both regions A and B of the complex human
serum spectrum have been revealed by the two filters, which leave a
line shape free of fluorescence background, free of noise, and with
its original form. The result is particularly clear in the sharp
definition of the phenylalanine peak, which is located around 1000
cm⁻¹ and characterizes the human serum. Our results also suggest
that, after the noise reduction, the two peaks located in the
region between 375 and 600 cm⁻¹ may become clear candidates for
glucose identification in human serum [22, 29]. This is a logical
conclusion considering the smaller influence of the other analytes
of the serum mix in this region and the clear definition obtained
after the treatment of the signal with our method. These results
confirm that our method may be useful for the screening of
metabolites in human samples.

IV. CONCLUSIONS 

A two-step automated method to suppress the fluorescence and shot
noises in Raman spectra has been implemented in a homemade program
and tested on the spectra of biological materials. Based on
morphological and moving average operations, the method appears to
be efficient. Simulations show that, after the morphological and
moving average operations are applied to the raw Raman data, the
line shape is free of fluorescence background and shot noise, and
the peaks keep their original characteristics: position, area,
intensity, and relative intensity among peaks. In addition, the
method can resolve low-intensity peaks that would otherwise be
masked by the fluorescence and the shot noise. Our results suggest
that the method may have an immediate application in improving
spectroscopic processes for food quality assurance and for the
screening of metabolites in human samples.